Back to Article
Article Notebook
Download Source

Changing Traffic Patterns in Washington, DC: Fallout from the COVID-19 Pandemic

PPOL 6805: GIS for Spatial Data Science Final Project

Author
Affiliation

Holt Cochran

Georgetown University

Introduction

Washington, DC is frequently listed as having some of worst traffic in the United States (1). There are several factors that contribute to the large amount of traffic in the District: the large and dense population within the city limits packs people into a small land area; bottlenecks exist in certain areas of the city, such as the bridges crossing the Potomac River; drivers commuting from the sprawling suburban areas in Virginia and Maryland are funneled into tight corridors of the city; peak hours of the day, when people commute to and from work, escalate the traffic. Taken together, the District of Columbia experiences high rates of traffic, however there is no one single factor that contributes to the large build-up of drivers within the city (2) - rather, it is a confluence of characteristics of the city and drivers that contribute to vehicle backups.

When COVID-19 began spreading rapidly across the United States in March 2020, businesses, schools, and establishments closed to limit interactions and spread of the virus (3). People stayed home to shelter, and many functions of society were taken online in a remote setting (4). This caused an almost immediate stop in the number of people commuting into cities, including Washington, DC, for work, school, or other functions. The ability to work, learn, and exist in a remote capacity removed reasons for continuing to live in the same areas as before the pandemic - research shows that during the pandemic, large swaths of the US population relocated or moved to other areas (5).

Since the height of the COVID-19 lockdowns, society has reopened to in-person interactions, but remote functions of work, school, doctors visits, etc. still exist. My research for this project centers on how changes in human behavior during the COVID-19 pandemic have affected traffic patterns in Washington, DC and what city policymakers should do to address changes in traffic trends.

Hypothesis: Traffic has increased in Washington, DC since the reopening of schools, businesses, and return to office policies, but not to the same level as before the pandemic. However, do to mass relocation of people during the pandemic, traffic patterns and trends have changed in Washington, DC since COVID-19. Traffic has to become more spread out across the city, as people have moved and traffic overall has decreased, leading to new patterns in areas that did not previously have bad traffic. City policies need adjusting to address these changes and alleviate new traffic patterns and trends.

I anticipate that spatial autocorrelation is high before the pandemic, but drops off after the pandemic. This is due to more disperion in traffic, as people stay home more frequently and live farther outside of the city/in more residential areas possibly without large traffic patterns pre-pandemic.

Methodology

To research changes in traffic patterns in Washington, DC, I am using data from District of Columbia Department of Transportation (DDOT) from 2018 - 2022 (6). The data tracks traffic at a yearly level - these are reported in a statistic called Annual Average Daily Traffic (AADT), which is the average daily traffic a road experiences in a year. The unit of analysis are roads in the district, identified per road by a Route ID. Using data over time, I can examine traffic data in the two years before and after the height of the pandemic in 2020. Comparing traffic patterns over time will allow me to examine changes in trends, patterns, and quantities.

I first plot the traffic patterns on map to visually examine average traffic in DC by year. I then calculate Moran’s I for the traffic of roads for each year of data to assess spatial autocorrelation of traffic. Finally, I run Monte Carlo simulations to assess the statistical significance of the Moran I values. Taken together, these methods show if spatial autocorrelation of traffic exists or has changed over time in the city, as new roads have developed traffic patterns as a result of trends caused by the COVID-19 pandemic.

In [1]:
######## Load Packages ########
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(sf)
Warning: package 'sf' was built under R version 4.3.3
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(spdep)
Warning: package 'spdep' was built under R version 4.3.3
Loading required package: spData
Warning: package 'spData' was built under R version 4.3.3
To access larger datasets in this package, install the spDataLarge
package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`
library(gganimate)
library(ggplot2)
library(leaflet)
library(gifski)
Warning: package 'gifski' was built under R version 4.3.3
library(purrr)
library(sf)
library(dplyr)
library(leaflet)
library(leaflet.extras)
Warning: package 'leaflet.extras' was built under R version 4.3.3
library(RColorBrewer)
library(viridis)
Loading required package: viridisLite
library(spatstat)
Warning: package 'spatstat' was built under R version 4.3.3
Loading required package: spatstat.data
Warning: package 'spatstat.data' was built under R version 4.3.3
Loading required package: spatstat.univar
Warning: package 'spatstat.univar' was built under R version 4.3.3
spatstat.univar 3.1-1
Loading required package: spatstat.geom
Warning: package 'spatstat.geom' was built under R version 4.3.3
spatstat.geom 3.3-4
Loading required package: spatstat.random
Warning: package 'spatstat.random' was built under R version 4.3.3
spatstat.random 3.3-2
Loading required package: spatstat.explore
Warning: package 'spatstat.explore' was built under R version 4.3.3
Loading required package: nlme
Warning: package 'nlme' was built under R version 4.3.3

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

spatstat.explore 3.3-3
Loading required package: spatstat.model
Warning: package 'spatstat.model' was built under R version 4.3.3
Loading required package: rpart
spatstat.model 3.3-3
Loading required package: spatstat.linnet
Warning: package 'spatstat.linnet' was built under R version 4.3.3
spatstat.linnet 3.2-3

spatstat 3.3-0 
For an introduction to spatstat, type 'beginner' 
library(htmltools)
library(spatialreg)
Warning: package 'spatialreg' was built under R version 4.3.3
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'spatialreg'

The following objects are masked from 'package:spdep':

    get.ClusterOption, get.coresOption, get.mcOption,
    get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
    set.coresOption, set.mcOption, set.VerboseOption,
    set.ZeroPolicyOption
#rm(list=ls()) # remove objects


######## Import Data ######## 
dc_traffic_2018 <- st_read("Data/2018_Traffic_Volume.geojson")
Reading layer `2018_Traffic_Volume' from data source 
  `/Users/holtcochran/PPOL6805_final_project/Data/2018_Traffic_Volume.geojson' 
  using driver `GeoJSON'
Simple feature collection with 22835 features and 17 fields
Geometry type: MULTILINESTRING
Dimension:     XY, XYZ
Bounding box:  xmin: -77.11664 ymin: 38.79323 xmax: -76.90953 ymax: 38.99526
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84
dc_traffic_2019 <- st_read("Data/2019_Traffic_Volume.geojson")
Reading layer `2019_Traffic_Volume' from data source 
  `/Users/holtcochran/PPOL6805_final_project/Data/2019_Traffic_Volume.geojson' 
  using driver `GeoJSON'
replacing null geometries with empty geometries
Simple feature collection with 21434 features and 17 fields (with 45 geometries empty)
Geometry type: GEOMETRY
Dimension:     XY, XYZ
Bounding box:  xmin: -77.11664 ymin: 38.79323 xmax: -76.90953 ymax: 38.99526
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84
dc_traffic_2020 <- st_read("Data/2020_Traffic_Volume.geojson")
Reading layer `2020_Traffic_Volume' from data source 
  `/Users/holtcochran/PPOL6805_final_project/Data/2020_Traffic_Volume.geojson' 
  using driver `GeoJSON'
replacing null geometries with empty geometries
Simple feature collection with 7734 features and 17 fields (with 4 geometries empty)
Geometry type: LINESTRING
Dimension:     XY, XYZ
Bounding box:  xmin: -77.11664 ymin: 38.79323 xmax: -76.90953 ymax: 38.99236
z_range:       zmin: NA zmax: NA
Geodetic CRS:  WGS 84
dc_traffic_2021 <- st_read("Data/2021_Traffic_Volume.geojson")
Reading layer `2021_Traffic_Volume' from data source 
  `/Users/holtcochran/PPOL6805_final_project/Data/2021_Traffic_Volume.geojson' 
  using driver `GeoJSON'
replacing null geometries with empty geometries
Simple feature collection with 21702 features and 17 fields (with 170 geometries empty)
Geometry type: GEOMETRY
Dimension:     XY, XYZ
Bounding box:  xmin: -77.11664 ymin: 38.79323 xmax: -76.90953 ymax: 38.99526
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84
dc_traffic_2022 <- st_read("Data/2022_Traffic_Volume.geojson")
Reading layer `2022_Traffic_Volume' from data source 
  `/Users/holtcochran/PPOL6805_final_project/Data/2022_Traffic_Volume.geojson' 
  using driver `GeoJSON'
Simple feature collection with 8217 features and 16 fields
Geometry type: MULTILINESTRING
Dimension:     XY, XYZ
Bounding box:  xmin: -77.11664 ymin: 38.79323 xmax: -76.90953 ymax: 38.99236
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84
dc_traffic_2018 <- dc_traffic_2018 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2019 <- dc_traffic_2019 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2020 <- dc_traffic_2020 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2021 <- dc_traffic_2021 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2022 <- dc_traffic_2022 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

Exploratory Data Analysis (EDA)

I first plot the traffic data on a map to explore trends in traffic patterns at a high level. Colors of the roads indicate the amount of Average Annual Traffic by road, which darker shades of red indicating areas of higher traffic. The map is displayed by year, allowing for comparison of traffic between years.

Figure 1. Average Annual Daily Traffic (AADT) in Washington, DC

In [2]:
######## Plot Data Over Time with Leaflet ######## 
# Preprocess the data for each year (2017-2023)
dc_traffic_list <- list()

# Iterate through each year to preprocess
for (year in 2018:2022) {
  data_name <- paste0("dc_traffic_", year)
  dc_traffic <- get(data_name)
  
  # Remove Z-dimension and transform to CRS 4326
  dc_traffic <- dc_traffic %>%
    mutate(geometry = st_zm(geometry)) %>%
    st_transform(crs = 4326) %>%
    mutate(AADT = sqrt(AADT)) 
  
  dc_traffic_list[[year]] <- dc_traffic
}

##### Leaflet Plot ##### 

# Define a moderately bright custom color palette
bright_colors <- c("#4CAF50", "#FFEB3B", "#FF9800", "#F44336", "#D32F2F")  # Bright green, yellow, orange, red, and maroon

# Assuming dc_traffic_list is a list of the traffic data for each year, e.g., dc_traffic_list[[2017]], dc_traffic_list[[2018]], etc.

# Initialize the leaflet map with base tiles
leaflet_map <- leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  setView(lng = -77.0369, lat = 38.895, zoom = 12)  # Center the map on Washington, DC

# Loop through each year's data
for (year in 2018:2022) {
  dc_traffic <- dc_traffic_list[[year]]
  
  # Create a color scale based on the AADT values (no transformation applied)
  color_scale_custom <- colorNumeric(palette = bright_colors,  
                                     domain = range(dc_traffic$AADT, na.rm = TRUE))
  
  # Add the data to the leaflet map
  leaflet_map <- leaflet_map %>%
    addPolylines(data = dc_traffic, 
                 color = ~color_scale_custom(AADT),  # Use the AADT values directly
                 weight = 2, 
                 opacity = 0.6, 
                 group = as.character(year)) %>%
    addLayersControl(
      overlayGroups = as.character(2018:2022),
      options = layersControlOptions(collapsed = FALSE),
      position = "bottomright"
    )
}

# Add the legend to the map
leaflet_map <- leaflet_map %>%
  addLegend(
    position = "bottomleft",   # Position of the legend
    pal = color_scale_custom,   # Color scale to use
    values = range(dc_traffic$AADT, na.rm = TRUE),  # AADT range for the legend
    title = "AADT (hundreds)",
    opacity = 1
  )

# Display the map
leaflet_map

I then examine the amount of traffic at a more granular level for each year, allowing for comparisons of the percentage of high-traffic routes for each year. Figure 2 displays a histogram of AADT per year breaking down traffic into larger categories for comparison.

Figure 2.

In [3]:
# filter data
dc_traffic_2018_filtered <- dc_traffic_2018 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2019_filtered <- dc_traffic_2019 %>%
    select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2020_filtered <- dc_traffic_2020 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2021_filtered <- dc_traffic_2021 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2022_filtered <- dc_traffic_2022 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
  
dc_traffic_all_years <- bind_rows(
  dc_traffic_2018_filtered %>% mutate(year = 2018),
  dc_traffic_2019_filtered %>% mutate(year = 2019),
  dc_traffic_2020_filtered %>% mutate(year = 2020),
  dc_traffic_2021_filtered %>% mutate(year = 2021),
  dc_traffic_2022_filtered %>% mutate(year = 2022)
)


# Filter the dataset to include only roads with AADT >= 50,000
dc_traffic_all_years_filtered <- dc_traffic_all_years %>%
  filter(AADT >= 50000)

# Create bins for AADT with a bin width of 50,000
dc_traffic_all_years_filtered <- dc_traffic_all_years_filtered %>%
  mutate(AADT_bin = cut(
    AADT,
    breaks = seq(50000, max(AADT, na.rm = TRUE), by = 50000),  # Bins with a width of 50,000
    include.lowest = TRUE,
    right = FALSE,
    labels = paste0(
      round(seq(50000, max(AADT, na.rm = TRUE) - 50000, by = 50000) / 1000, 1), "k-", 
      round(seq(100000, max(AADT, na.rm = TRUE), by = 50000) / 1000, 1), "k"
    )
  ))

# Plot the histogram of AADT by year with the 50,000 bins
AADT_Histogram <- ggplot(dc_traffic_all_years_filtered, aes(x = factor(year), fill = AADT_bin)) +
  geom_bar(position = "dodge", width = 0.8) +  # Standardize bar width
  labs(title = "Annual Average Daily Traffic (AADT) by Year",
       x = "Year", y = "Frequency") +
  theme_minimal() +
  scale_fill_brewer(
    palette = "YlGnBu",
    name = "AADT Bins"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

AADT_Histogram

Moran’s I

Next I calculate the Moran’s I for the Average Annual Daily Traffic for each road with data for each year from 2018 - 2022. Moran’s I statistics are used to quantify spatial correlation and assess clustering of data. I use Moran’s I to assess if traffic in DC is spatially auto correlated, and the strength of the clustering of traffic on roads. To reduce computational complexity, I filter to only assess AADT > 50,000 - this only includes roads that have average traffic of more than 50,000 drivers a day. This focuses the analysis on roads with large quantities of drivers and traffic, which is the focus of this research.

In [4]:
# filter for AADT > 100000
dc_traffic_2018_50k <- dc_traffic_2018_filtered %>% filter(AADT > 50000)
dc_traffic_2019_50k <- dc_traffic_2019_filtered %>% filter(AADT > 50000)
dc_traffic_2020_50k <- dc_traffic_2020_filtered %>% filter(AADT > 50000)
dc_traffic_2021_50k <- dc_traffic_2021_filtered %>% filter(AADT > 50000)
dc_traffic_2022_50k <- dc_traffic_2022_filtered %>% filter(AADT > 50000)

dc_traffic_all_years_50k <- bind_rows(
  dc_traffic_2018_50k,
  dc_traffic_2019_50k,
  dc_traffic_2020_50k,
  dc_traffic_2021_50k,
  dc_traffic_2022_50k
)


# Convert road geometries to Simple Feature objects if not already
dc_traffic_2018_sf <- st_as_sf(dc_traffic_2018_50k)

dc_traffic_2019_sf <- st_as_sf(dc_traffic_2019_50k)
dc_traffic_2019_sf <- dc_traffic_2019_sf %>%
  filter(!st_is_empty(geometry)) 

dc_traffic_2020_sf <- st_as_sf(dc_traffic_2020_50k)
dc_traffic_2020_sf <- dc_traffic_2020_sf %>%
  filter(!st_is_empty(geometry)) 

dc_traffic_2021_sf <- st_as_sf(dc_traffic_2021_50k)
dc_traffic_2022_sf <- st_as_sf(dc_traffic_2022_50k)

# Compute pairwise distances between geometries (in meters or desired units)
dist_2018 <- st_distance(dc_traffic_2018_sf)
dist_2019 <- st_distance(dc_traffic_2019_sf)
dist_2020 <- st_distance(dc_traffic_2020_sf)
dist_2021 <- st_distance(dc_traffic_2021_sf)
dist_2022 <- st_distance(dc_traffic_2022_sf)

threshold <- units::set_units(100, "m")


# Define a function to create spatial weights matrix
create_weights_matrix <- function(distance_matrix, threshold) {
  # Convert distance matrix to binary weights based on the threshold
  weights_matrix <- ifelse(distance_matrix <= threshold, 1, 0)
  weights_matrix[lower.tri(weights_matrix, diag = TRUE)] <- 0  # Remove self-links
  return(weights_matrix)
}

# Create spatial weights matrices for each year
weights_2018 <- create_weights_matrix(dist_2018, threshold)
weights_2019 <- create_weights_matrix(dist_2019, threshold)
weights_2020 <- create_weights_matrix(dist_2020, threshold)
weights_2021 <- create_weights_matrix(dist_2021, threshold)
weights_2022 <- create_weights_matrix(dist_2022, threshold)


# Function to compute Moran's I
compute_morans_i <- function(aadt_values, weights_matrix) {
  # Create a spatial weight list object from the matrix with style "W" (binary)
  nb <- mat2listw(weights_matrix, style = "W", zero.policy = TRUE)  # Allow zero neighbors
  
  # Compute Moran's I
  moran_i <- moran.test(aadt_values, nb)
  
  return(moran_i$estimate[1])  # Moran's I value
}

# Calculate Moran's I for each year
moran_i_2018 <- compute_morans_i(dc_traffic_2018_sf$AADT, weights_2018)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 9 sub-graphs
moran_i_2019 <- compute_morans_i(dc_traffic_2019_sf$AADT, weights_2019)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 9 sub-graphs
moran_i_2020 <- compute_morans_i(dc_traffic_2020_sf$AADT, weights_2020)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 5 sub-graphs
moran_i_2021 <- compute_morans_i(dc_traffic_2021_sf$AADT, weights_2021)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 5 sub-graphs
moran_i_2022 <- compute_morans_i(dc_traffic_2022_sf$AADT, weights_2022)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 7 sub-graphs
# Create a data frame to compare Moran's I across years
moran_comparison <- data.frame(
  Year = 2018:2022,
  Moran_I = c(moran_i_2018, moran_i_2019, moran_i_2020, moran_i_2021, moran_i_2022)
)
print(moran_comparison)
  Year   Moran_I
1 2018 0.7689117
2 2019 0.8398937
3 2020 0.8239006
4 2021 0.7395062
5 2022 0.7830352

Figure 3 displays Moran’s I comparison across years. There is peak Moran’s I value of 0.84 in 2019, with a clear dip to 0.74 in 2021.

Figure 3.

In [5]:
morans_time <- ggplot(moran_comparison, aes(x = Year, y = Moran_I)) +
  geom_line() +
  geom_point() +
  labs(title = "Moran's I Comparison Across Years", x = "Year", y = "Moran's I") +
  theme_minimal()
morans_time

Spatial Autoregressive (SAR) Model

I then a spatial autoregressive (SAR) model to test the statistical significance of spatial autocorrelation of the traffic in Washingon, DC. I run a spatial regression on the Average Annual Daily Traffic for pre-pandemic years (2018-2019) and post-pandemic (2021-2022). I exclude 2020 because this was when the COVID-19 pandemic spread across the world and lockdowns were implemented, so analyses could be skewed as people were forced to stay home.

Table 1 displays results from the pre-pandemic statistical model, while table 2 displays results from the post-pandemic model.

Table 1. Pre-Pandemic Spatial Regression Model (2018-2019)

In [6]:
# Pre-pandemic (2018-2019) data
pre_pandemic_data <- dc_traffic_all_years %>% filter(year %in% 2018:2019)

pre_pandemic_data <- st_as_sf(pre_pandemic_data)

pre_pandemic_data <- st_transform(pre_pandemic_data, crs = 6487)

# Sample data for pre-pandemic (if needed)
sampled_pre <- pre_pandemic_data %>%
  group_by(year) %>%
  slice_sample(n = 1000, replace = FALSE) %>%
  ungroup()

# Calculate centroids for pre-pandemic data
coords_pre <- st_coordinates(st_centroid(sampled_pre))
Warning: st_centroid assumes attributes are constant over geometries
# Pre-pandemic k-nearest neighbors (adjust k as needed)
nb_knn_pre <- knn2nb(knearneigh(coords_pre, k = 5))  # Adjust k if needed
Warning in knearneigh(coords_pre, k = 5): knearneigh: identical points found
Warning in knearneigh(coords_pre, k = 5): knearneigh: kd_tree not available for
identical points
# Convert to spatial weights matrices (row standardization)
listw_pre <- nb2listw(nb_knn_pre, style = "W")

# Fit SAR model for pre-pandemic period
sar_model_pre <- lagsarlm(AADT ~ year, data = sampled_pre, listw = listw_pre)
Warning in lagsarlm(AADT ~ year, data = sampled_pre, listw = listw_pre): inversion of asymptotic covariance matrix failed for tol.solve = 2.22044604925031e-16 
  reciprocal condition number = 4.19027e-18 - using numerical Hessian.
# Summary of pre-pandemic model
model_result_pre <- summary(sar_model_pre)
print(model_result_pre)

Call:lagsarlm(formula = AADT ~ year, data = sampled_pre, listw = listw_pre)

Residuals:
     Min       1Q   Median       3Q      Max 
-66415.6  -6761.0  -3183.2   2777.4 175323.0 

Type: lag 
Coefficients: (numerical Hessian approximate standard errors) 
               Estimate  Std. Error z value  Pr(>|z|)
(Intercept) -1155368.21   321299.28 -3.5959 0.0003232
year             575.77      159.18  3.6172 0.0002978

Rho: 0.50611, LR test value: 354.81, p-value: < 2.22e-16
Approximate (numerical Hessian) standard error: 0.023561
    z-value: 21.481, p-value: < 2.22e-16
Wald statistic: 461.43, p-value: < 2.22e-16

Log likelihood: -22248.56 for lag model
ML residual variance (sigma squared): 255630000, (sigma: 15988)
Number of observations: 2000 
Number of parameters estimated: 4 
AIC: 44505, (AIC for lm: 44858)

Table 2. Post-Pandemic Spatial Regression Model (2021-2022)

In [7]:
# Post-pandemic (2020-2022) data
post_pandemic_data <- dc_traffic_all_years %>% filter(year %in% 2021:2022)

post_pandemic_data <- st_as_sf(post_pandemic_data)

post_pandemic_data <- st_transform(post_pandemic_data, crs = 6487)

# Sample data for post-pandemic (if needed)
sampled_post <- post_pandemic_data %>%
  group_by(year) %>%
  slice_sample(n = 1000, replace = FALSE) %>%
  ungroup()

coords_post <- st_coordinates(st_centroid(sampled_post))
Warning: st_centroid assumes attributes are constant over geometries
# Post-pandemic k-nearest neighbors
nb_knn_post <- knn2nb(knearneigh(coords_post, k = 5))  # Adjust k if needed
Warning in knearneigh(coords_post, k = 5): knearneigh: identical points found
Warning in knearneigh(coords_post, k = 5): knearneigh: kd_tree not available
for identical points
listw_post <- nb2listw(nb_knn_post, style = "W")

# Fit SAR model for post-pandemic period
sar_model_post <- lagsarlm(AADT ~ year, data = sampled_post, listw = listw_post)
Warning in lagsarlm(AADT ~ year, data = sampled_post, listw = listw_post): inversion of asymptotic covariance matrix failed for tol.solve = 2.22044604925031e-16 
  reciprocal condition number = 5.62839e-18 - using numerical Hessian.
Warning in sqrt(diag(fdHess)[-1]): NaNs produced
# Summary of post-pandemic model
# Summary of post-pandemic model
model_result_post <- summary(sar_model_post)
print(model_result_post)

Call:lagsarlm(formula = AADT ~ year, data = sampled_post, listw = listw_post)

Residuals:
     Min       1Q   Median       3Q      Max 
-48684.6  -6404.7  -3466.8   2207.7 221038.3 

Type: lag 
Coefficients: (numerical Hessian approximate standard errors) 
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -186735.55        NaN     NaN      NaN
year             95.88        NaN     NaN      NaN

Rho: 0.41641, LR test value: 175.84, p-value: < 2.22e-16
Approximate (numerical Hessian) standard error: 0.028647
    z-value: 14.536, p-value: < 2.22e-16
Wald statistic: 211.29, p-value: < 2.22e-16

Log likelihood: -22205.26 for lag model
ML residual variance (sigma squared): 249480000, (sigma: 15795)
Number of observations: 2000 
Number of parameters estimated: 4 
AIC: 44419, (AIC for lm: 44592)

Discussion

The Rho value is also smaller in the post-pandemic period (0.39893) compared to the pre-pandemic period (0.46811). This indicates that, after the pandemic, traffic became more dispersed and less clustered spatially, suggesting a potential shift in traffic patterns—areas with high traffic were not as strongly influenced by neighboring areas with high traffic

Conclusion